NSF PAR Search | NSF Public Access Repository

A Multimodal Spatio-Temporal GCN Model with Enhancements for Isolated Sign Recognition

Zhou, Yang; Xia, Zhaoyang; Chen, Yuxiao; Neidle, Carol; Metaxas, Dimitris (May 2024, Proceedings of the LREC-COLING 2024 11th Workshop on the Representation and Processing of Sign Languages: Evaluation of Sign Language Resources)
Efthimiou, Eleni; Fotinea, Stavroula-Evita; Hanke, Thomas; Hochgesang, Julie A; Mesch, Johanna; Schulder, Marc (Ed.)
We propose a multimodal network using skeletons and handshapes as input to recognize individual signs and detect their boundaries in American Sign Language (ASL) videos. Our method integrates a spatio-temporal Graph Convolutional Network (GCN) architecture to estimate human skeleton keypoints; it uses a late-fusion approach for both forward and backward processing of video streams. Our core method is designed for the extraction and analysis of features from ASL videos, to enhance the accuracy and efficiency of recognition of individual signs. A Gating module based on per-channel multi-layer convolutions is employed to evaluate significant frames for recognition of isolated signs. Additionally, an auxiliary multimodal branch network, integrated with a transformer, is designed to estimate the linguistic start and end frames of an isolated sign within a video clip. We evaluated the performance of our approach on multiple datasets that include isolated, citation-form signs and signs pre-segmented from continuous signing based on linguistic annotations of start and end points of signs within sentences. We have achieved very promising results when using both types of sign videos combined for training, with overall sign recognition accuracy of 80.8% Top-1 and 95.2% Top-5 for citation-form signs, and 80.4% Top-1 and 93.0% Top-5 for signs pre-segmented from continuous signing.
Full Text Available
Distributionally Robust Decision Making Leveraging Conditional Distributions

Chen, Yuxiao; Kim, Jip; Anderson, James (December 2022, Proceedings of the IEEE 61st Conference on Decision and Control (CDC))

Distributionally robust optimization (DRO) is a powerful tool for decision making under uncertainty. It is particularly appealing because of its ability to leverage existing data. However, many practical problems call for decision-making with some auxiliary information, and DRO in the context of conditional distributions is not straightforward. We propose a conditional kernel distributionally robust optimization (CKDRO) method that enables robust decision making under conditional distributions through kernel DRO and the conditional mean operator in the reproducing kernel Hilbert space (RKHS). In particular, we consider problems where there is a correlation between the unknown variable y and an auxiliary observable variable x. Given past data of the two variables and a queried auxiliary variable, CKDRO represents the conditional distribution P(y|x) as the conditional mean operator in the RKHS space and quantifies the ambiguity set in the RKHS as well, which depends on the size of the dataset as well as the query point. To justify the use of RKHS, we demonstrate that the ambiguity set defined in RKHS can be viewed as a ball under a metric that is similar to the Wasserstein metric. The DRO is then dualized and solved via a finite dimensional convex program. The proposed CKDRO approach is applied to a generation scheduling problem and shows that the result of CKDRO is superior to common benchmarks in terms of quality and robustness.
Full Text Available
Sign Language Video Anonymization

Xia, Zhaoyang; Chen, Yuxiao; Zhangli, Qilong; Huenerfauth, Matt; Neidle, Carol; Metaxas, Dimitris (June 2022, LREC2022 10th Workshop on the Representation and Processing of Sign Languages: Multilingual Sign Language Resources)

Deaf signers who wish to communicate in their native language frequently share videos on the Web. However, videos cannot preserve privacy—as is often desirable for discussion of sensitive topics—since both hands and face convey critical linguistic information and therefore cannot be obscured without degrading communication. Deaf signers have expressed interest in video anonymization that would preserve linguistic content. However, attempts to develop such technology have thus far shown limited success. We are developing a new method for such anonymization, with input from ASL signers. We modify a motion-based image animation model to generate high-resolution videos with the signer identity changed, but with preservation of linguistically significant motions and facial expressions. An asymmetric encoder-decoder structured image generator is used to generate the high-resolution target frame from the low-resolution source frame based on the optical flow and confidence map. We explicitly guide the model to attain clear generation of hands and face by using bounding boxes to improve the loss computation. FID and KID scores are used for evaluation of the realism of the generated frames. This technology shows great potential for practical applications to benefit deaf signers.
Full Text Available
Sign Language Video Anonymization

Xia, Zhaoyang; Chen, Yuxiao; Zhangli, Qilong; Huenerfauth, Matt; Neidle, Carol; Metaxas, Dimitris (June 2022, Proceedings of the LREC2022 10th Workshop on the Representation and Processing of Sign Languages: Multilingual Sign Language Resources, Marseille, France, 25 June 2022)

Full Text Available
Onboard Safety Guarantees for Racing Drones: High-Speed Geofencing With Control Barrier Functions

https://doi.org/10.1109/LRA.2022.3144777

Singletary, Andrew; Swann, Aiden; Chen, Yuxiao; Ames, Aaron D. (April 2022, IEEE Robotics and Automation Letters)

Full Text Available
Robust Disturbance Rejection for Robotic Bipedal Walking: System-Level-Synthesis with Step-to-step Dynamics Approximation

https://doi.org/10.1109/CDC45484.2021.9683065

Xiong, Xiaobin; Chen, Yuxiao; Ames, Aaron D. (December 2021, 60th IEEE Conference on Decision and Control (CDC))

Full Text Available
Sign Language Video Anonymization

Xia, Zhaoyang; Chen, Yuxiao; Zhangli, Qilong; Huenerfauth, Matt; Neidle, Carol; Metaxas, Dimitris (January 2022, Proceedings of the LREC2022 10th Workshop on the Representation and Processing of Sign Languages: Multilingual Sign Language Resources)

Full Text Available
The Mixed-Observable Constrained Linear Quadratic Regulator Problem: the Exact Solution and Practical Algorithms

https://doi.org/10.1109/TAC.2022.3210871

Rosolia, Ugo; Chen, Yuxiao; Daftry, Shreyansh; Ono, Masahiro; Yue, Yisong; Ames, Aaron D. (January 2022, IEEE Transactions on Automatic Control)

Full Text Available
Decentralized Task and Path Planning for Multi-Robot Systems

https://doi.org/10.1109/LRA.2021.3068103

Chen, Yuxiao; Rosolia, Ugo; Ames, Aaron D. (July 2021, IEEE Robotics and Automation Letters)
Full Text Available
Guaranteed Obstacle Avoidance for Multi-Robot Operations With Limited Actuation: A Control Barrier Function Approach

https://doi.org/10.1109/LCSYS.2020.3000748

Chen, Yuxiao; Singletary, Andrew; Ames, Aaron D. (January 2021, IEEE Control Systems Letters)
Full Text Available
